- Convnets continued
- Applied as a sliding window, the convnet gives a score for every category at every window in the input
- non-max suppression
- Choose the one with the highest score
- Or look for neighbors that made the same decision
- CV course
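A minimal sketch of the "choose the one with the highest score" variant of non-max suppression, for a 1-D row of window scores. The function name, the `threshold` and `radius` parameters, and the tie-breaking rule are illustrative assumptions, not something the lecture specifies.

```python
def non_max_suppression(scores, threshold, radius):
    """Keep windows that score at least `threshold` and are the local maximum.

    scores: list of floats, one per window position (1-D sliding window).
    radius: how many neighbors on each side compete with a window (assumed).
    Returns the indices of the surviving windows.
    """
    kept = []
    for i, s in enumerate(scores):
        if s < threshold:
            continue  # too weak to count as a detection at all
        lo = max(0, i - radius)
        hi = min(len(scores), i + radius + 1)
        # Keep the window only if it is the best in its neighborhood.
        # Ties go to the earliest index (an arbitrary choice).
        if s == max(scores[lo:hi]) and scores.index(s, lo, hi) == i:
            kept.append(i)
    return kept


scores = [0.1, 0.7, 0.9, 0.8, 0.2, 0.1, 0.6, 0.65, 0.3]
peaks = non_max_suppression(scores, threshold=0.5, radius=1)
print(peaks)  # [2, 7]
```

Windows 1 and 3 fire as well, but they are suppressed because their neighbor at index 2 scores higher.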
- need to have a consistent interpretation of the scores
- can take the shortest path through the sliding windows to minimize cost and identify the right string
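The shortest-path idea can be sketched as a Viterbi-style dynamic program over the windows. This assumes the scores have already been turned into comparable costs (for example negative log-probabilities), that category 0 is a hypothetical "blank / no character" class, and that changing label between adjacent windows pays a constant `switch_cost`. All of these are illustrative choices rather than the lecture's exact formulation.

```python
def best_path(costs, switch_cost):
    """costs[t][c] is the cost of category c at window t.

    Returns the label sequence (one label per window) with minimum total cost,
    where changing label between adjacent windows costs `switch_cost`.
    """
    n_cat = len(costs[0])
    total = list(costs[0])  # cheapest cost of any path ending in each category
    back = []               # back-pointers, one list per window after the first
    for t in range(1, len(costs)):
        new_total, pointers = [], []
        for c in range(n_cat):
            # Cheapest predecessor, paying switch_cost when the label changes.
            prev = min(range(n_cat),
                       key=lambda p: total[p] + (0 if p == c else switch_cost))
            pointers.append(prev)
            step = 0 if prev == c else switch_cost
            new_total.append(total[prev] + step + costs[t][c])
        total = new_total
        back.append(pointers)
    # Trace the cheapest path backwards from the best final category.
    c = min(range(n_cat), key=lambda k: total[k])
    path = [c]
    for pointers in reversed(back):
        c = pointers[c]
        path.append(c)
    path.reverse()
    return path


def collapse(path, blank=0):
    """Merge runs of the same label and drop blanks to get the output string."""
    out, prev = [], None
    for c in path:
        if c != prev and c != blank:
            out.append(c)
        prev = c
    return out


# Five windows, three categories: 0 = blank, 1 and 2 = two characters.
costs = [
    [0.1, 2.0, 2.0],
    [2.0, 0.2, 1.5],
    [2.0, 0.3, 1.0],
    [0.2, 2.0, 2.0],
    [2.0, 1.8, 0.1],
]
path = best_path(costs, switch_cost=0.5)
labels = collapse(path)
print(path, labels)  # [0, 1, 1, 0, 2] [1, 2]
```

The costs only add up meaningfully if every window's scores are on the same scale, which is why the scores need a consistent interpretation.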
- Works well for detection
- Collect datasets of images with & without faces
- Train convnet to return true if there is a face and false if there isn't
- Will have a lot of false positives
- training set probably doesn't have enough non-faces
- go for a large collection of non-faces
- run the detector on them, and add the ones where it fires (false positives) to the negative set
- repeat
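The collect / detect / add / repeat loop above can be sketched as follows. The `train` callback, the `rounds` limit, and the toy one-number "images" with a threshold classifier are all stand-ins invented for illustration; a real version would retrain the convnet at each round.

```python
def mine_hard_negatives(train, positives, negatives, non_face_pool, rounds=3):
    """Grow the negative set with non-faces on which the detector fires.

    train(positives, negatives) must return a detector: a function that
    returns True when it thinks its input is a face (hypothetical interface).
    """
    negatives = list(negatives)
    pool = list(non_face_pool)
    detector = train(positives, negatives)
    for _ in range(rounds):
        # Every firing on a known non-face is a false positive.
        false_positives = [x for x in pool if detector(x)]
        if not false_positives:
            break  # nothing left to learn from this pool
        negatives.extend(false_positives)
        pool = [x for x in pool if not detector(x)]
        detector = train(positives, negatives)  # retrain and repeat
    return detector, negatives


def train_threshold(positives, negatives):
    """Toy 'training': threshold halfway between the two classes."""
    threshold = (min(positives) + max(negatives)) / 2
    return lambda x: x > threshold


# Each sample is a single made-up 'faceness' feature instead of an image.
faces = [0.8, 0.9, 1.0]
non_faces = [0.1, 0.2]
pool = [0.3, 0.55, 0.7, 0.05]  # large collection of non-faces
final_detector, final_negatives = mine_hard_negatives(
    train_threshold, faces, non_faces, pool
)
print(final_negatives)  # [0.1, 0.2, 0.55, 0.7]
```

In this toy run the first detector fires on 0.55 and 0.7. Once those are added as negatives, the retrained detector no longer fires on anything in the pool.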
- Also need to handle size variation
- Can scale image to handle different face sizes
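One common way to scale the image is a pyramid: the detector keeps a fixed window size and is run on successively smaller copies of the image, so larger faces shrink to fit the window. The sketch below only computes the pyramid sizes. The 32-pixel window and the scale step of 2 are assumed values, not figures from the lecture.

```python
def pyramid_sizes(width, height, window=32, step=2.0):
    """List the (width, height) of each pyramid level, largest first.

    Levels stop once the image is smaller than the detector window.
    At level k the fixed window covers roughly window * step**k pixels
    of the original image, so it matches faces of that size.
    """
    sizes = []
    scale = 1.0
    while True:
        w, h = int(width / scale), int(height / scale)
        if min(w, h) < window:
            break  # the window no longer fits in the image
        sizes.append((w, h))
        scale *= step
    return sizes


levels = pyramid_sizes(128, 96, window=32, step=2.0)
print(levels)  # [(128, 96), (64, 48)]
```

A smaller step (say 1.5) gives finer coverage of face sizes at the price of more convnet evaluations.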
- Semantic Segmentation
- Assign a category to every pixel
- Used stereo vision to get labeled data
- Train convnet to predict labels
- Stereo only works up to 10m
- Can use neural network + stereo for fast responses
- Semantic segme